Smart Paper Notes

SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery

Yezhen Cong , Samar Khanna , Chenlin Meng , Patrick Liu , Erik Rozi , Yutong He , Marshall Burke , David B. Lobell , Stefano Ermon

Categories: Computer Vision | Artificial Intelligence

2022-07-17

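Unsupervised pre-training methods for large vision models have been shown to improve performance on downstream supervised tasks. Developing similar techniques for satellite imagery presents significant opportunities, because unlabeled data is plentiful and the inherent temporal and multi-spectral structure provides avenues to further improve existing pre-training strategies. In this paper, we present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on the Masked Autoencoder (MAE). To leverage temporal information, we include a temporal embedding and mask image patches independently across time. In addition, we demonstrate that encoding multi-spectral data as groups of bands with distinct spectral positional encodings is beneficial. Our approach yields strong improvements over previous state-of-the-art techniques, both in supervised learning performance on benchmark datasets (up to $\uparrow$ 7%) and in transfer learning performance on downstream remote sensing tasks, including land cover classification (up to $\uparrow$ 14%) and semantic segmentation.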
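As a rough illustration of masking patches independently across time, the sketch below draws a separate random set of masked patch indices for each frame of an image time series. It is not the authors' implementation: the function name, its parameters, and the 0.75 mask ratio are assumptions chosen for the example.

```python
import random

def independent_temporal_masks(num_frames, num_patches, mask_ratio, seed=0):
    """Sample one set of masked patch indices per frame.

    Each frame gets its own random draw, so a patch position hidden in
    one frame may still be visible in another frame.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    return [
        set(rng.sample(range(num_patches), num_masked))
        for _ in range(num_frames)
    ]

# Three frames, each split into 16 patches, 75% of patches masked per frame.
masks = independent_temporal_masks(num_frames=3, num_patches=16, mask_ratio=0.75)
for t, masked in enumerate(masks):
    print(f"frame {t}: masked patches {sorted(masked)}")
```

The contrast is with a consistent scheme that would reuse one mask for every frame; drawing masks per frame lets the model use a patch visible at one time step to help reconstruct the same location at another.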

Translated by Google Translate

Secure and Privacy Preserving Proxy Biometrics Identities

Harkeerat Kaur , Rishabh Shukla , Isao Echizen , Pritee Khanna

Categories: Computer Vision

2022-12-21

With the large-scale adoption of biometric-based applications, the security and privacy of biometrics are of utmost importance, especially when operating in unsupervised online mode. This work proposes a novel approach for generating new artificial fingerprints, also called proxy fingerprints, that are natural looking, non-invertible, revocable and privacy preserving. These proxy biometrics can be generated from the original ones only with the help of a user-specific key. Instead of the original fingerprint, these proxy templates can be used anywhere with the same convenience. The manuscript walks through how proxy fingerprints of different types can be generated and how they can be combined with user-specific keys to provide revocability and cancelability in case of compromise. Using the proposed approach, a proxy dataset is generated from samples belonging to the Anguli fingerprint database. Matching experiments were performed on the new set, which is 5 times larger than the original, and it was found that their performance is at par, with 0 FAR and 0 FRR in both the stolen-key and safe-key scenarios. Other parameters on revocability and diversity are also analyzed for protection performance.
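For readers unfamiliar with the two error rates quoted above, the following sketch computes the false acceptance rate (FAR) and false rejection rate (FRR) from match scores at a fixed threshold. The scores and the threshold are made-up values for illustration and are not taken from the paper; the function name is hypothetical.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for similarity scores at a decision threshold.

    A comparison is accepted when its score is >= threshold.
    FAR: fraction of impostor comparisons that are wrongly accepted.
    FRR: fraction of genuine comparisons that are wrongly rejected.
    """
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))

# Illustrative scores only (assumed, not experimental data).
genuine = [0.90, 0.80, 0.95]
impostor = [0.10, 0.30, 0.20]
far, frr = far_frr(genuine, impostor, threshold=0.5)
print(far, frr)
```

With perfectly separated score distributions, as in this toy input, any threshold between the two groups gives 0 FAR and 0 FRR.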

Tensions Between the Proxies of Human Values in AI

Teresa Datta , Daniel Nissani , Max Cembalest , Akash Khanna , Haley Massa , John P. Dickerson

Categories: Machine Learning | Artificial Intelligence

2022-12-14

Motivated by mitigating potentially harmful impacts of technologies, the AI community has formulated and accepted mathematical definitions for certain pillars of accountability: e.g. privacy, fairness, and model transparency. Yet, we argue this is fundamentally misguided because these definitions are imperfect, siloed constructions of the human values they hope to proxy, while giving the guise that those values are sufficiently embedded in our technologies. Under popularized methods, tensions arise when practitioners attempt to achieve each pillar of fairness, privacy, and transparency in isolation or simultaneously. In this position paper, we push for redirection. We argue that the AI community needs to consider all the consequences of choosing certain formulations of these pillars -- not just the technical incompatibilities, but also the effects within the context of deployment. We point towards sociotechnical research for frameworks for the latter, but push for broader efforts into implementing these in practice.

PIZZA: A new benchmark for complex end-to-end task-oriented parsing

Konstantine Arkoudas , Nicolas Guenon des Mesnards , Melanie Rubino , Sandesh Swamy , Saarthak Khanna , Weiqi Sun , Khan Haidar

Categories: Natural Language Processing | Machine Learning

2022-12-01

Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate. This paper continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents. We perform an extensive evaluation of deep-learning techniques for task-oriented parsing on this dataset, including different flavors of seq2seq systems and RNNGs. The dataset comes in two main versions, one in a recently introduced utterance-level hierarchical notation that we call TOP, and one whose targets are executable representations (EXR). We demonstrate empirically that training the parser to directly generate EXR notation not only solves the problem of entity resolution in one fell swoop and overcomes a number of expressive limitations of TOP notation, but also results in significantly greater parsing accuracy.

Automatic Crater Shape Retrieval using Unsupervised and Semi-Supervised Systems

Atal Tewari , Vikrant Jain , Nitin Khanna

Categories: Computer Vision

2022-11-03

Impact craters are formed due to continuous impacts on the surface of planetary bodies. Most recent deep learning-based crater detection methods treat craters as circular shapes, and less attention is paid to extracting the exact shapes of craters. Extracting precise shapes of the craters can be helpful for many advanced analyses, such as crater formation. This paper proposes a combination of unsupervised non-deep learning and semi-supervised deep learning approaches to accurately extract the shapes of craters and detect craters missing from the existing catalog. In the unsupervised non-deep learning part, we propose an adaptive rim extraction algorithm to extract craters' shapes. In this algorithm, we utilize the elevation profiles of DEMs and apply morphological operations on DEM-derived slopes to extract craters' shapes. The extracted shapes are used in semi-supervised deep learning to obtain the locations, sizes, and refined shapes. Further, the extracted shapes are used to improve the estimates of the craters' diameter, depth, and other morphological factors. The craters' shapes, estimated diameters, and depths, along with other morphological factors, will be made publicly available.

Fragile object transportation by a multi-robot system in an unknown environment using a semi-decentralized control approach

Dibyendu Roy , Sreejeet Maity , Madhubanti Maitra , Samar Bhattacharya

Categories: Robotics

2022-09-12

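In this paper, we introduce a semi-decentralized control technique for a swarm of robots transporting a fragile object to a destination in an uncertain, occluded environment. The proposed approach is split into two parts. The initial part (Phase 1) consists of a centralized control strategy for creating a specific formation among the agents so that the object to be transported can be positioned properly on top of the system. We present a novel triangle packing scheme, fused with a circular region-based shape control method, for creating a rigid configuration among the robots. In the later part (Phase 2), the swarm system is required to convey the object to the destination in a decentralized way, employing the region-based shape control approach. Simulation results and a comparison study demonstrate the effectiveness of our proposed scheme.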

Explainable vision transformer enabled convolutional neural network for plant disease identification: PlantXViT

Poornima Singh Thakur , Pritee Khanna , Tanuja Sheorey , Aparajita Ojha

Categories: Computer Vision

2022-07-16

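Plant diseases are the primary cause of crop losses globally, with an impact on the world economy. To address these issues, smart agriculture solutions are evolving that combine the Internet of Things and machine learning for early disease detection and control. Many such systems use vision-based machine learning methods for real-time disease detection and diagnosis. With the advancement of deep learning techniques, new methods have emerged that employ convolutional neural networks for plant disease detection and identification. Another trend in vision-based deep learning is the use of vision transformers, which have proved to be powerful models for classification and other problems. However, vision transformers have rarely been investigated for plant pathology applications. In this study, a vision transformer enabled convolutional neural network model, PlantXViT, is proposed for plant disease identification. The proposed model combines the capabilities of traditional convolutional neural networks with those of vision transformers to efficiently identify a large number of plant diseases across several crops. The proposed model has a lightweight structure with only 0.8 million trainable parameters, which makes it suitable for IoT-based smart agriculture services. The performance of PlantXViT is evaluated on five publicly available datasets. The proposed PlantXViT network performs better than five state-of-the-art methods on all five datasets. The average accuracy for recognizing plant diseases exceeds 93.55%, 92.59%, and 98.33% on the Apple, Maize, and Rice datasets, respectively, even under challenging background conditions. The explainability of the proposed model is evaluated using gradient-weighted class activation maps and Local Interpretable Model-agnostic Explanations.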

Discriminative Kernel Convolution Network for Multi-Label Ophthalmic Disease Detection on Imbalanced Fundus Image Dataset

Amit Bhati , Neha Gour , Pritee Khanna , Aparajita Ojha

Categories: Computer Vision

2022-07-16

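It is feasible to recognize the presence and severity of eye disease by investigating changes in the biological structure of the retina. Fundus examination is a diagnostic procedure for examining the biological structure and abnormalities of the eye. Ophthalmic diseases such as glaucoma, diabetic retinopathy, and cataract are the main causes of visual impairment around the world. Ocular Disease Intelligent Recognition (ODIR-5K) is a benchmark structured fundus image dataset used by researchers for multi-label, multi-disease classification of fundus images. This work presents a discriminative kernel convolution network (DKCNet), which explores discriminative region-wise features without adding extra computational cost. DKCNet is composed of an attention block followed by a squeeze-and-excitation (SE) block. The attention block takes features from the backbone network and generates discriminative feature attention maps. The SE block takes the discriminative feature maps and improves channel interdependencies. Better performance of DKCNet is observed with an InceptionResNet backbone network for multi-label classification of ODIR-5K fundus images, with 96.08 AUC, 94.28 F1-score, and 0.81 kappa score. The proposed method splits the common target label for an eye pair based on the diagnostic keywords. Based on these labels, oversampling and undersampling are performed to resolve class imbalance. To check the bias of the proposed model towards the training data, the model trained on the ODIR dataset is tested on three publicly available benchmark datasets. It is found to perform well on completely unseen fundus images as well.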
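The squeeze-and-excitation step mentioned in this abstract can be sketched generically as follows. This is a plain-Python illustration of a standard SE block (global average pooling, two fully connected layers, channel-wise rescaling), not DKCNet's actual code: biases are omitted for brevity, and the weights `w1` and `w2` and the tiny input are hypothetical.

```python
import math

def se_block(feature_map, w1, w2):
    """Apply a simplified squeeze-and-excitation block.

    feature_map: list of C channels, each a 2D list (rows of floats).
    w1: hidden_dim x C weight matrix, w2: C x hidden_dim weight matrix.
    """
    # Squeeze: global average pooling gives one value per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: ReLU(w1 @ squeezed), then sigmoid(w2 @ hidden).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_map, gates)]

# Two 2x2 channels; zero excitation weights make every gate sigmoid(0) = 0.5.
x = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[1.0, 1.0]]        # hidden_dim = 1 (illustrative)
w2 = [[0.0], [0.0]]
y = se_block(x, w1, w2)
print(y)
```

The gates lie in (0, 1), so the block can only reweight channels relative to one another; in a trained network the learned weights decide which channels are emphasized.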

RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network

Vitaliy Chiley , Vithursan Thangarasa , Abhay Gupta , Anshul Samar , Joel Hestness , Dennis DeCoste

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-06-28

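This work introduces RevSilo, the first reversible module for bidirectional multi-scale feature fusion. Like other reversible methods, RevSilo eliminates the need to store hidden activations by recomputing them. Existing reversible methods, however, do not apply to multi-scale feature fusion and are therefore not applicable to a large class of networks. Bidirectional multi-scale feature fusion promotes local and global coherence and has become a de facto design principle for networks targeting spatially sensitive tasks, such as HRNet and EfficientDet. When paired with high-resolution inputs, these networks achieve state-of-the-art results across various computer vision tasks, but training them requires substantial accelerator memory to save large, multi-resolution activations. These memory requirements cap network size and limit progress. Using reversible recomputation, RevSilo alleviates memory issues while still operating across resolution scales. By stacking RevSilos, we create RevBiFPN, a fully reversible bidirectional feature pyramid network. For classification, RevBiFPN is competitive with networks such as EfficientNet while using up to 19.8x less training memory. When fine-tuned on COCO, RevBiFPN provides up to a 2.5% boost in AP while using fewer MACs and less training-time memory.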
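To make the idea of reversible recomputation concrete, the sketch below shows a generic additive coupling block whose inputs can be reconstructed exactly from its outputs, so the inputs need not be kept in memory for the backward pass. It illustrates the general principle only and is not the RevSilo module; the functions `f` and `g` stand in for arbitrary sub-networks and are assumptions made for the example.

```python
def reversible_forward(x1, x2, f, g):
    """Additive coupling: compute outputs (y1, y2) from inputs (x1, x2)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def reversible_inverse(y1, y2, f, g):
    """Recover (x1, x2) from (y1, y2) by undoing the two additions in reverse."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

# Placeholder sub-networks (hypothetical); they do not need to be invertible.
f = lambda v: [2 * t for t in v]
g = lambda v: [t + 1 for t in v]

x1, x2 = [1, 2], [3, 4]
y1, y2 = reversible_forward(x1, x2, f, g)
r1, r2 = reversible_inverse(y1, y2, f, g)
print(y1, y2, r1, r2)
```

Because only additions are undone, `f` and `g` themselves never have to be inverted, which is what allows the activations to be recomputed rather than stored during training.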

PInKS: Preconditioned Commonsense Inference with Minimal Supervision

Ehsan Qasemi , Piyush Khanna , Qiang Ning , Muhao Chen

Categories: Natural Language Processing

2022-06-16

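Reasoning with preconditions, such as "a glass can be used for drinking water", remains an open problem for language models. The main challenge lies in the scarcity of precondition data and the models' lack of support for such reasoning. We present PInKS, Preconditioned Commonsense Inference with Weak Supervision, an improved model for reasoning with preconditions through minimal supervision. We show, both empirically and theoretically, that PInKS improves results on benchmarks focused on reasoning with the preconditions of commonsense knowledge (up to 40% Macro-F1 scores). We further investigate PInKS through PAC-Bayesian informativeness analysis, precision measures, and an ablation study.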
